
Syllable-Level Representations of Suprasegmental Features for DNN-Based Text-to-Speech Synthesis


Abstract

A top-down hierarchical system based on deep neural networks is investigated for modeling prosody in speech synthesis. Suprasegmental features are processed separately from segmental features, and a compact distributed representation of high-level units is learned at the syllable level. This suprasegmental representation is then integrated into a frame-level network. Objective measures show that balancing segmental and suprasegmental features can be useful for the frame-level network. Additional features are then incorporated into the hierarchical system and tested: at the syllable level, a bag-of-phones representation is proposed, and at the word level, embeddings learned from text sources are used. The hierarchical system is shown to leverage new features at higher levels more efficiently than a system that exploits them directly at the frame level. A perceptual evaluation of the proposed systems is conducted, followed by a discussion of the results.
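
As a rough illustration of the syllable-level bag-of-phones idea and of passing a syllable-level vector down to frames, here is a minimal Python sketch. The abstract does not give the details, so everything here is an assumption made for illustration: the tiny phone inventory `PHONE_SET`, the use of counts rather than binary indicators, the helper names, and integration by simple concatenation. In the system described above, the vector handed to the frame-level network would be the learned compact representation, not the raw bag-of-phones vector.

```python
from collections import Counter

# Hypothetical phone inventory (assumption); a real front end defines its own full set.
PHONE_SET = ["k", "ae", "t", "s", "ih", "ng"]


def bag_of_phones(syllable_phones, phone_set=PHONE_SET):
    """Return a fixed-length count vector for the phones in one syllable.

    Phone order inside the syllable is discarded, hence "bag".
    """
    counts = Counter(syllable_phones)
    unknown = set(counts) - set(phone_set)
    if unknown:
        raise ValueError(f"phones not in inventory: {sorted(unknown)}")
    return [counts[p] for p in phone_set]


def concat_frame_inputs(segmental_frames, syllable_vector):
    """Append one syllable-level vector to every frame the syllable spans.

    Concatenation is an assumed integration scheme, not one stated in the abstract.
    """
    return [list(frame) + list(syllable_vector) for frame in segmental_frames]


# One syllable with phones k-ae-t-s, spanning two frames of toy segmental features.
syllable_vec = bag_of_phones(["k", "ae", "t", "s"])
frame_inputs = concat_frame_inputs([[0.1, 0.2], [0.3, 0.4]], syllable_vec)
```

Each frame input then carries both its own segmental features and the shared syllable-level context, which is the kind of balance between segmental and suprasegmental information that the abstract refers to.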
